ABSTRACT


A  recently  proposed  iterative  thresholding  scheme 
turns  out  to  be  essentially  the  well-known  ISODATA 
clustering  algorithm,  applied  to  a  one-dimensional 
feature  space  (the  sole  feature  of  a  pixel  is  its  gray 
level).  We  prove  that  in  one  dimension,  ISODATA  always 
converges.  We  also  apply  it  to  requantize  images  into 
specified  numbers  of  gray  levels. 
1.  Introduction

An iterative method for threshold selection has been proposed
by Ridler and Calvard in [1] for object-background discrimination.
After an initial guess is made, at each iteration we get a new
threshold in the following way: given a threshold $T_i$, the next
threshold $T_{i+1}$ is the average of $V_{above}$ and $V_{below}$, where
$V_{above}$ is obtained by integrating all points above $T_i$ and
$V_{below}$ by integrating all points below $T_i$. $T_{i+1}$ is,
hopefully, a better threshold than $T_i$ for object-background
discrimination. The process terminates as soon as we have
$T_{i+1} = T_i$, which usually requires about four iterations. The
initial value, $T_0$, is chosen by selecting a region in the image
(its four corners) that is most likely to contain only points of
the same class, the background.

The method is applied to thresholding a low-contrast image
containing a handwritten signature.

Except for the choice of the initial threshold, the method
described in [1] can be thought of as a one-dimensional application
of the ISODATA algorithm (as described in [2]) where we restrict
the number of classes to two. (In the ISODATA algorithm what is
initially chosen are the means of the classes.)

In  this  note  we  give  some  other  applications  of  the  ISODATA 
algorithm  in  which  we  consider  numbers  of  classes  other  than  two. 

It  is  also  proved  that  the  algorithm  for  the  one-dimensional, 
two-class  case  always  converges. 


2.  The  ISODATA  algorithm 


The basic ISODATA algorithm [2] is a procedure for classifying
a set of sample vectors $X = \{x_1, x_2, \ldots, x_m\}$ into $c$
distinct classes.

Algorithm  2.1  Basic  ISODATA. 

1. Choose some initial values for the means $\mu_1, \mu_2, \ldots, \mu_c$;

Loop: 2.  Classify  the  m  samples  by  assigning  them  to  the  class 
having  closest  mean; 

3. Recompute the means as the averages of the samples in
each class;

4.  If  any  mean  has  changed  value,  go  to  loop;  otherwise 
stop. 
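
The four steps of Algorithm 2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name `basic_isodata` is hypothetical, "closest" is taken to mean smallest Euclidean distance, ties go to the lowest class index, and a class that receives no samples simply keeps its previous mean. None of these choices is fixed by the description above.

```python
def basic_isodata(samples, means):
    """Sketch of Algorithm 2.1. samples and means are lists of equal-length tuples."""
    def sqdist(a, b):
        # Squared Euclidean distance (assumed metric for "closest mean").
        return sum((x - y) ** 2 for x, y in zip(a, b))

    means = [tuple(m) for m in means]
    while True:
        # Step 2: assign each sample to the class having the closest mean.
        labels = [min(range(len(means)), key=lambda c: sqdist(s, means[c]))
                  for s in samples]
        # Step 3: recompute each mean as the average of the samples in its class.
        new_means = []
        for c, old in enumerate(means):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                new_means.append(tuple(sum(col) / len(members)
                                       for col in zip(*members)))
            else:
                new_means.append(old)  # assumption: an empty class keeps its mean
        # Step 4: stop when no mean has changed value.
        if new_means == means:
            return means, labels
        means = new_means


samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
means, labels = basic_isodata(samples, [(0, 0), (10, 10)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```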

In our case, we have an image in which each point has a
"gray level" integer value in the interval [0,L]. Each sample
is thus a point of the image. The distribution of gray levels
is given by a histogram $h$, where $h(0), h(1), \ldots, h(L)$ are the
numbers of points with gray levels $0, 1, \ldots, L$. Let [LO,UP] be the
smallest interval containing all non-zero histogram values. In
this one-dimensional case, the ISODATA algorithm may be rewritten
as:

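Algorithm 2.2 One-dimensional ISODATA:

1. Choose some initial values for the means $\mu_1, \mu_2, \ldots, \mu_c$,
such that $LO \le \mu_1 < \mu_2 < \ldots < \mu_c \le UP$;

Loop: 2. Calculate thresholds $T_1, T_2, \ldots, T_{c-1}$ by the formula:

$T_i = \lfloor (\mu_i + \mu_{i+1})/2 \rfloor$, $1 \le i < c$;

Assign to class $i$, $1 \le i \le c$, all gray levels in the interval
$I_i = [T_{i-1}+1, T_i]$ (we define $T_0 = LO-1$ and $T_c = UP$);

3. Recompute the means as the averages of the gray levels in
each class:

$\mu_i = \sum_{j \in I_i} j\,h(j) \,/\, \sum_{j \in I_i} h(j)$, $1 \le i \le c$;

4. If any mean has changed value, go to loop; otherwise
stop.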


In step (3) of the algorithm, if it happens that
$\sum_{j \in I_i} h(j) = 0$ for some $i$, in other words, there is no
point whose gray level falls in the interval $I_i$, one should
suppress class $i$ and consider just the remaining classes.
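
The one-dimensional procedure, including the suppression of classes that receive no points, can be sketched as follows. The function name `isodata_1d` and the list-based histogram are assumptions made for illustration; the thresholds use the floor of the midpoint between adjacent means, and each mean is recomputed as the histogram-weighted average gray level of its interval.

```python
def isodata_1d(h, means):
    """Sketch of Algorithm 2.2.

    h[g] is the number of points with gray level g; means is an increasing
    list of initial class means inside [LO, UP]. Returns (thresholds, means).
    """
    levels = [g for g, n in enumerate(h) if n > 0]
    lo, up = levels[0], levels[-1]
    means = list(means)
    while True:
        # Step 2: T_i = floor((mu_i + mu_{i+1}) / 2), with T_0 = LO-1 and T_c = UP.
        bounds = ([lo - 1]
                  + [int((a + b) // 2) for a, b in zip(means, means[1:])]
                  + [up])
        # Step 3: recompute the mean over each interval I_i = [T_{i-1}+1, T_i].
        new_means = []
        for t_prev, t in zip(bounds, bounds[1:]):
            interval = range(t_prev + 1, t + 1)
            count = sum(h[g] for g in interval)
            if count == 0:
                continue  # suppress a class into which no gray level falls
            new_means.append(sum(g * h[g] for g in interval) / count)
        # Step 4: stop when no mean has changed value.
        if new_means == means:
            return bounds[1:-1], means
        means = new_means


h = [0] * 16
h[2], h[3], h[4] = 5, 10, 5      # dark cluster
h[11], h[12], h[13] = 4, 8, 4    # bright cluster
print(isodata_1d(h, [2, 13]))     # ([7], [3.0, 12.0])
print(isodata_1d(h, [2, 7, 13]))  # middle class is empty and gets suppressed
```

In the second call the middle interval [5,10] contains no points, so the run continues with two classes and ends at the same threshold as the first call.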

For the two-class case we can show that Algorithm 2.2
converges. The proof reveals how Algorithm 2.2 works. First,
however, it is necessary to prove two lemmas.

Lemma 2.1

Let $\mu_1^k$ and $\mu_2^k$ be the means of classes 1 and 2 respectively,
and $T^k$ the threshold at the kth iteration. Then we have
$LO \le \mu_1^k \le T^k < \mu_2^k \le UP$.

Proof:

For $k=0$ we have, by Step 1 of the algorithm, that
$LO \le \mu_1^0 < \mu_2^0 \le UP$. As $T^0 = \lfloor (\mu_1^0 + \mu_2^0)/2 \rfloor$,
then $\mu_1^0 \le T^0 < \mu_2^0$.

Let us suppose that the inequality is true for the (k-1)st
iteration. The new values for the means are based on the
intervals $I_1^{k-1} = [LO, T^{k-1}]$ and $I_2^{k-1} = [T^{k-1}+1, UP]$.
Therefore $\mu_1^k \in I_1^{k-1}$ and $\mu_2^k \in I_2^{k-1}$, which implies
that $LO \le \mu_1^k < \mu_2^k \le UP$. Again, as
$T^k = \lfloor (\mu_1^k + \mu_2^k)/2 \rfloor$, we have $\mu_1^k \le T^k < \mu_2^k$. //


Lemma 2.2

If $T^i \le T^{i+1}$ then $T^{i+1} \le T^{i+2}$. Similarly, if
$T^i \ge T^{i+1}$ then $T^{i+1} \ge T^{i+2}$.

Proof: 

We  prove  only  the  first  part  of  Lemma  2.2,  as  the  second 
part  is  analogous  to  the  first. 

If $T^i = T^{i+1}$, then $\mu_1^{i+1} = \mu_1^{i+2}$ and
$\mu_2^{i+1} = \mu_2^{i+2}$ and thus $T^{i+1} = T^{i+2}$.

If $T^i < T^{i+1}$, the interval $I_1^{i+1}$ where $\mu_1^{i+2}$ will be
calculated is the union of the intervals $I_1^i$ and $[T^i+1, T^{i+1}]$.
So, besides the points of $I_1^i$ that contributed to $\mu_1^{i+1}$, we
have the new gray levels of $[T^i+1, T^{i+1}]$ that are larger than
$\mu_1^{i+1}$. Therefore $\mu_1^{i+2} \ge \mu_1^{i+1}$. Similarly, the
interval $I_2^i$ where $\mu_2^{i+1}$ is calculated is the union of
$[T^i+1, T^{i+1}]$ and $I_2^{i+1}$ and, therefore,
$\mu_2^{i+2} \ge \mu_2^{i+1}$.

Since both means $\mu_1^{i+2}$ and $\mu_2^{i+2}$ are larger, we have
$T^{i+2} \ge T^{i+1}$. //

Lemma  2.2  shows  that  the  threshold,  if  it  moves,  moves  only 
in  one  direction. 
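
The one-directional movement of the threshold can be observed numerically. The sketch below records the successive thresholds of the two-class case on a small made-up histogram; the function name `threshold_trace` and the histogram values are illustrative assumptions, not data from the pictures discussed in this note.

```python
def threshold_trace(h, mu1, mu2):
    """Two-class case: return the thresholds T^0, T^1, ... until they stop moving."""
    levels = [g for g, n in enumerate(h) if n > 0]
    lo, up = levels[0], levels[-1]

    def mean(a, b):
        # Histogram-weighted mean gray level over the interval [a, b].
        count = sum(h[g] for g in range(a, b + 1))
        return sum(g * h[g] for g in range(a, b + 1)) / count

    trace = [int((mu1 + mu2) // 2)]
    while True:
        t = trace[-1]
        t_next = int((mean(lo, t) + mean(t + 1, up)) // 2)
        if t_next == t:
            return trace
        trace.append(t_next)


h = [8, 4, 2, 1, 1, 1, 1, 2, 4, 8]   # made-up histogram, LO = 0, UP = 9
print(threshold_trace(h, 0, 1))       # [0, 3, 4]
print(threshold_trace(h, 8, 9))       # [8, 6, 4]
```

Starting low, the threshold only increases; starting high, it only decreases. In both runs every threshold stays inside [LO,UP], as Lemma 2.1 requires.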


Theorem 2.1

Algorithm  2.2  converges  in  a  finite  number  of  steps. 

Proof: 

By Lemma 2.2, the sequence $T^0, T^1, \ldots$ of thresholds forms
either a non-decreasing or non-increasing sequence. By Lemma
2.1 this sequence of integers is bounded and, therefore, there
must be a $k$ such that $T^k = T^{k+1}$. If this happens, then
$T^k = T^{k+j}$ for all $j$. Moreover, the maximum number of steps
is limited by the size of the interval [LO,UP]. //

Observation:

Although we can prove the convergence of the algorithm,
at least for two classes, this does not mean that the algorithm
always arrives at the same threshold, irrespective of the initial
values for the means.


3. Examples


Algorithm 2.2 was tested on three different pictures.
Figures 1, 3, and 5 show the original pictures together with
their histograms. The results for numbers of classes equal
to 8, 4, 3, and 2 for Figures 1 and 3 are shown in Figures 2
and 4 ((a), (b), (c), and (d), respectively). Figure 6 shows
the result for Figure 5 when we consider only two classes.
The initial means were evenly spread in the interval [LO,UP].
The number of iterations together with the thresholds are given
in Table 1.

Table 1. Number of iterations and thresholds for each figure
(columns: number of classes, number of iterations).
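
The observation that the final threshold can depend on the initial means can be checked on a small synthetic histogram. The sketch below is a hypothetical illustration: the name `final_threshold` and the three-peak histogram are assumptions, chosen so that two different starting points settle at two different thresholds.

```python
def final_threshold(h, mu1, mu2):
    """Two-class case: iterate until the threshold stops changing and return it."""
    levels = [g for g, n in enumerate(h) if n > 0]
    lo, up = levels[0], levels[-1]
    t = int((mu1 + mu2) // 2)
    while True:
        low, high = range(lo, t + 1), range(t + 1, up + 1)
        m1 = sum(g * h[g] for g in low) / sum(h[g] for g in low)
        m2 = sum(g * h[g] for g in high) / sum(h[g] for g in high)
        t_next = int((m1 + m2) // 2)
        if t_next == t:
            return t
        t = t_next


# Three equal peaks at gray levels 0, 5 and 10 (made-up data).
h = [10, 0, 0, 0, 0, 10, 0, 0, 0, 0, 10]
print(final_threshold(h, 0, 5))    # 3: the middle peak joins the bright class
print(final_threshold(h, 5, 10))   # 6: the middle peak joins the dark class
```

Both runs converge, but to different stable thresholds, because the middle peak can be absorbed by either class.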


4.  Comments


In this note we showed that the ISODATA algorithm can be
used not only for thresholding a picture, as in [1], but also
to requantize it into a few gray levels. In the latter
application, it seems that one can achieve reasonable data compression
without significant distortion of the image, as in [3]. However,
an advantage over the method proposed in [3] is that one can
specify a priori the number of gray levels desired.
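
Requantization with a given set of thresholds can be sketched as a lookup table over gray levels. The note does not say which output level represents each class, so the sketch assumes the rounded class mean; the function name `requantize` and the nested-list image format are likewise assumptions made for illustration.

```python
def requantize(pixels, h, thresholds):
    """Replace every gray level by the rounded mean of its class.

    pixels is a list of rows of gray levels, h[g] is the histogram, and
    thresholds are T_1 < ... < T_{c-1} as produced by a prior clustering run.
    Using the class mean as output level is an assumption of this sketch.
    """
    levels = [g for g, n in enumerate(h) if n > 0]
    bounds = [levels[0] - 1] + list(thresholds) + [levels[-1]]
    lut = list(range(len(h)))  # levels outside [LO, UP] do not occur in the image
    for t_prev, t in zip(bounds, bounds[1:]):
        interval = range(t_prev + 1, t + 1)
        count = sum(h[g] for g in interval)
        if count == 0:
            continue  # empty class: nothing to map
        rep = round(sum(g * h[g] for g in interval) / count)
        for g in interval:
            lut[g] = rep
    return [[lut[p] for p in row] for row in pixels]


h = [0] * 16
h[2], h[3], h[4] = 5, 10, 5
h[11], h[12], h[13] = 4, 8, 4
print(requantize([[2, 3], [12, 13]], h, [7]))  # [[3, 3], [12, 12]]
```

With c classes the output image contains at most c distinct gray levels, which is the sense in which the number of levels is specified a priori.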
Figure 6. Result of thresholding (two classes).